Abstract: We demonstrate the strength of a coupling derived from a Gaussian
approximation of Zaitsev (1987a) by revisiting two strong approximation
results for the empirical process of Dudley and Philipp (1983), and using the
coupling to derive extended and refined versions of them.



1. Introduction 



Einmahl and Mason [17] pointed out in their Fact 2.2 that the Strassen-Dudley
theorem (see Theorem 11.6.2 in [11]) in combination with a special case of Theorem
1.1 and Example 1.2 of Zaitsev [42] yields the following coupling. Here $|\cdot|_N$,
$N \ge 1$, denotes the usual Euclidean norm on $\mathbb{R}^N$.

Coupling inequality. Let $Y_1, \dots, Y_n$ be independent mean zero random vectors in
$\mathbb{R}^N$, $N \ge 1$, such that for some $B > 0$,

$|Y_i|_N \le B$, $i = 1, \dots, n$.

If $(\Omega, \mathcal{T}, \mathbb{P})$ is rich enough then for each $\delta > 0$, one can define independent
normally distributed mean zero random vectors $Z_1, \dots, Z_n$ with $Z_i$ and $Y_i$ having the
same variance/covariance matrix for $i = 1, \dots, n$, such that for universal constants
$C_1 > 0$ and $C_2 > 0$,

(1.1) $\mathbb{P}\left\{ \left| \sum_{i=1}^{n} (Y_i - Z_i) \right|_N > \delta \right\} \le C_1 N^2 \exp\left( -\frac{C_2 \delta}{N^2 B} \right)$.
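To get a feel for how the right-hand side of (1.1) trades off the deviation level against the dimension, the following minimal Python sketch evaluates it. The function name `coupling_bound` and the default values `c1 = c2 = 1.0` are assumptions made for illustration only; the text does not give numerical values for the universal constants $C_1$ and $C_2$.

```python
import math


def coupling_bound(delta, n_dim, b, c1=1.0, c2=1.0):
    """Right-hand side of (1.1): C1 * N^2 * exp(-C2 * delta / (N^2 * B)).

    c1 and c2 stand in for the universal constants C_1 and C_2, whose
    values are not given in the text; the defaults are placeholders.
    """
    if delta <= 0 or n_dim < 1 or b <= 0:
        raise ValueError("need delta > 0, N >= 1 and B > 0")
    return c1 * n_dim**2 * math.exp(-c2 * delta / (n_dim**2 * b))


if __name__ == "__main__":
    # For fixed delta and B the bound deteriorates quickly as N grows,
    # which is why N must be balanced against n in applications.
    for n_dim in (1, 2, 5, 10):
        print(n_dim, coupling_bound(delta=50.0, n_dim=n_dim, b=1.0))
```

The bound is only informative once $\delta$ is large compared with $N^2 B$, which matches the remark that $N$ may grow with $n$ but must do so slowly.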

(Actually Einmahl and Mason did not specify the $N^2$ in (1.1) and they applied
a less precise result in [43]; however, their argument is equally valid when based
upon [42].) Often in applications, $N$ is allowed to increase with $n$. This result and
its variations, when combined with inequalities from empirical and Gaussian
processes and from probability on Banach spaces, have recently been shown to be
extremely powerful tools to establish a Gaussian approximation to the uniform
empirical process on the $d$-dimensional cube (Rio [34]), strong approximations for
the local empirical process (Einmahl and Mason [17]), extreme value results for the
Hopfield model (Bovier and Mason Q and Gentz and Lowe [3]), laws of the
iterated logarithm in Banach spaces (Einmahl and Kuelbs [15]), moderate deviations
for Banach space valued sums (Einmahl and Kuelbs [16]), and a functional large
deviation result for the local empirical process (Mason [26]). In this paper we shall
further demonstrate the strength of (1.1) by revisiting two strong approximation
results for the empirical process of Dudley and Philipp [14], and use (1.1) to derive
extended and refined versions of them.

Dudley and Philipp [14] was a pathbreaking paper, which introduced a very
effective technique for obtaining Gaussian approximations to sums of i.i.d. Banach
space valued random variables. The strong approximation results of theirs, which
we shall revisit, were derived from a much more general result in their paper. Key
to this result was their Lemma 2.12, which is a special case of an extension by
Dehling [8] of a Gaussian approximation in the Prokhorov distance to sums of
i.i.d. multivariate random vectors due to Yurinskii [41]. In essence, we shall be
replacing the application of their Lemma 2.12 by the above coupling inequality
(1.1) based upon Zaitsev [42]. We shall also update and streamline the methodology
by employing inequalities that were not available to Dudley and Philipp when they
wrote their paper.



1.1. The Gaussian approximation and strong approximation problems 

Let us begin by describing the Gaussian approximation problem for the empirical
process. For a fixed integer $n \ge 1$ let $X, X_1, \dots, X_n$ be independent and identically
distributed random variables defined on the same probability space $(\Omega, \mathcal{T}, \mathbb{P})$ and
taking values in a measurable space $(\mathcal{X}, \mathcal{A})$. Denote by $\mathbb{E}$ the expectation with
respect to $\mathbb{P}$ of real valued random variables defined on $(\Omega, \mathcal{T})$ and write $P = \mathbb{P}^X$.
Let $\mathcal{M}$ be the set of all measurable real valued functions on $(\mathcal{X}, \mathcal{A})$. In this paper we
consider the following two processes indexed by a sufficiently small class $\mathcal{F} \subset \mathcal{M}$.
First, define the $P$-empirical process indexed by $\mathcal{F}$ to be

(1.2) $\alpha_n(f) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \{ f(X_i) - \mathbb{E} f(X) \}$, $f \in \mathcal{F}$.

Second, define the $P$-Brownian bridge $G$ indexed by $\mathcal{F}$ to be the mean zero Gaussian
process with the same covariance function as $\alpha_n$,

(1.3) $\langle f, h \rangle = \mathrm{cov}(G(f), G(h)) = \mathbb{E}(f(X)h(X)) - \mathbb{E}(f(X))\,\mathbb{E}(h(X))$, $f, h \in \mathcal{F}$.

Under entropy conditions on $\mathcal{F}$, the Gaussian process $G$ has a version which is
almost surely continuous with respect to the intrinsic semi-metric

(1.4) $d_P(f, h) = \sqrt{\mathbb{E}(f(X) - h(X))^2}$, $f, h \in \mathcal{F}$,

that is, we include $d_P$-continuity in the definition of $G$.

Our goal is to show that a version of $X_1, \dots, X_n$ and $G$ can be constructed on
the same underlying probability space $(\Omega, \mathcal{T}, \mathbb{P})$ in such a way that

(1.5) $\|\alpha_n - G\|_{\mathcal{F}} = \sup_{f \in \mathcal{F}} |\alpha_n(f) - G(f)|$

is very small with high probability, under useful assumptions on $\mathcal{F}$ and $P$. This
is what we call the Gaussian approximation problem. We shall use our Gaussian
approximation results to define on the same probability space $(\Omega, \mathcal{T}, \mathbb{P})$ a sequence
$X_1, X_2, \dots$, i.i.d. $X$ and a sequence $G_1, G_2, \dots$, i.i.d. $G$ so that with high probability,

(1.6) $n^{-1/2} \max_{1 \le m \le n} \left\| \sqrt{m}\,\alpha_m - \sum_{i=1}^{m} G_i \right\|_{\mathcal{F}}$

is small. This is what we call the strong approximation problem.
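As a concrete instance of the empirical process in (1.2), take $P$ uniform on $[0,1]$ and the class of indicators $f_t = 1_{[0,t]}$, $0 \le t \le 1$, so that $\alpha_n(f_t) = \sqrt{n}(F_n(t) - t)$ with $F_n$ the empirical distribution function. This example class is an assumption chosen for illustration and is not one treated explicitly in the text. The Python sketch below computes $\sup_t |\alpha_n(f_t)|$ exactly by checking both sides of each jump of $F_n$; the function name is hypothetical.

```python
import math
import random


def empirical_process_sup(sample):
    """sup over t in [0, 1] of |alpha_n(1_[0,t])| for points in [0, 1].

    Assumes the underlying law P is Uniform(0, 1), so E f_t(X) = t.
    The supremum of |F_n(t) - t| is attained at (or just before) a
    sample point, so only those points need to be inspected.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    sup = 0.0
    for i, x in enumerate(sorted(sample), start=1):
        # F_n equals (i - 1)/n just before x and i/n at x.
        sup = max(sup, abs(i / n - x), abs((i - 1) / n - x))
    return math.sqrt(n) * sup


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (100, 10_000):
        xs = [rng.random() for _ in range(n)]
        print(n, empirical_process_sup(xs))
```

The printed values stay of order one as $n$ grows, in line with the approximation of $\alpha_n$ by a single Gaussian process $G$.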
1.2. Basic assumptions 

We shall assume that $\mathcal{F}$ satisfies the following boundedness condition (F.i) and
measurability condition (F.ii).

(F.i) For some $M > 0$, for all $f \in \mathcal{F}$, $\|f\|_{\mathcal{X}} = \sup_{x \in \mathcal{X}} |f(x)| \le M/2$.
(F.ii) The class $\mathcal{F}$ is point-wise measurable, i.e. there exists a countable subclass
$\mathcal{F}_\infty$ of $\mathcal{F}$ such that we can find for any function $f \in \mathcal{F}$ a sequence of
functions $\{f_m\}$ in $\mathcal{F}_\infty$ for which $\lim_{m \to \infty} f_m(x) = f(x)$ for all $x \in \mathcal{X}$.

Assumption (F.i) justifies the finiteness of all the integrals that follow as well as
the application of the key inequalities. The requirement (F.ii) is imposed to avoid
using outer probability measures in our statements; see Example 2.3.4 in [38].

We intend to compute probability bounds for (1.5) holding for any $n$ and some
fixed $M$ in (F.i) with ensuing constants independent of $n$.

2. Entropy approach based on Zaitsev [42] 

We shall require that one of the following two $L_2$-metric entropy conditions (VC)
and (BR) holds on the class $\mathcal{F}$. These conditions are commonly used in the context
of weak invariance principles and many examples are available; see e.g. van der
Vaart and Wellner [38] and Dudley [12]. In this section we shall state our main
results. We shall prove them in Section 5.

2.1. $L_2$-covering numbers

First we consider polynomially scattered classes $\mathcal{F}$. Let $F$ be an envelope function
for the class $\mathcal{F}$, that is, $F$ is a measurable function such that $|f(x)| \le F(x)$ for all
$x \in \mathcal{X}$ and $f \in \mathcal{F}$. Given a probability measure $Q$ on $(\mathcal{X}, \mathcal{A})$ endow $\mathcal{M}$ with the
semi-metric $d_Q$, where $d_Q^2(f, h) = \int_{\mathcal{X}} (f - h)^2 \, dQ$. Further, for any $f \in \mathcal{M}$ set
$Q(f^2) = d_Q^2(f, 0) = \int_{\mathcal{X}} f^2 \, dQ$. For any $\varepsilon > 0$ and probability measure $Q$ denote by
$N(\varepsilon, \mathcal{F}, d_Q)$ the minimal number of balls $\{f \in \mathcal{M} : d_Q(f, h) < \varepsilon\}$ of $d_Q$-radius $\varepsilon$
and center $h \in \mathcal{M}$ needed to cover $\mathcal{F}$. The uniform $L_2$-covering number is defined
to be

(2.1) $N_F(\varepsilon, \mathcal{F}) = \sup_Q N\left( \varepsilon \sqrt{Q(F^2)}, \mathcal{F}, d_Q \right)$,

where the supremum is taken over all probability measures $Q$ on $(\mathcal{X}, \mathcal{A})$ for which
$0 < Q(F^2) < \infty$. A class of functions $\mathcal{F}$ satisfying the following uniform entropy
condition will be called a VC class.

(VC) Assume that for some $c_0 > 0$, $\nu_0 > 0$, and envelope function $F$,

(2.2) $N_F(\varepsilon, \mathcal{F}) \le c_0 \varepsilon^{-\nu_0}$, $0 < \varepsilon < 1$.

The name "VC class" is given to this condition in recognition of Vapnik and
Cervonenkis [39], who introduced a condition on classes of sets which implies (VC).
In the sequel we shall assume that $F := M/2$ as in (F.i).
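To make the notion of a covering number concrete, consider again the illustrative (assumed, not taken from the text) class of indicators $1_{[0,t]}$, $0 \le t \le 1$, under $Q = P$ uniform on $[0,1]$, where $d_P(1_{[0,s]}, 1_{[0,t]}) = \sqrt{|s - t|}$. The Python sketch below builds an explicit $\varepsilon$-net with about $\varepsilon^{-2}$ centers, a polynomial count in the spirit of (2.2). It checks a single measure only, not the supremum over all $Q$ required in (2.1), and the helper names are hypothetical.

```python
import math


def d_p(s, t):
    """d_P(1_[0,s], 1_[0,t]) = sqrt(|s - t|) when P is Uniform(0, 1)."""
    return math.sqrt(abs(s - t))


def indicator_net(eps):
    """Centers t_k such that every 1_[0,t], 0 <= t <= 1, lies within
    d_P-distance eps of some 1_[0,t_k]."""
    if not 0 < eps < 1:
        raise ValueError("need 0 < eps < 1")
    h = eps**2                      # spacing in t, since d_P^2 = |s - t|
    m = math.ceil(1 / h)
    return [min(k * h, 1.0) for k in range(m + 1)]


if __name__ == "__main__":
    for eps in (0.5, 0.1, 0.05):
        print(eps, len(indicator_net(eps)))
```

Each $t$ is within $\varepsilon^2/2$ of a center, so its $d_P$-distance to the net is at most $\varepsilon/\sqrt{2} < \varepsilon$.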



Proposition 1. Under (F.i), (F.ii) and (VC) with $F := M/2$, for each $\lambda > 1$ there
exists a $\rho(\lambda) > 0$ such that for each integer $n \ge 1$ one can construct on the same
probability space random vectors $X_1, \dots, X_n$ i.i.d. $X$ and a version of $G$ such that

(2.3) $\mathbb{P}\left\{ \|\alpha_n - G\|_{\mathcal{F}} > \rho(\lambda)\, n^{-\tau_1} (\log n)^{\tau_2} \right\} \le n^{-\lambda}$,

where $\tau_1 = 1/(2 + 5\nu_0)$ and $\tau_2 = (4 + 5\nu_0)/(4 + 10\nu_0)$.
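The exponents in (2.3) depend on the entropy index $\nu_0$ only. A small Python sketch using exact rational arithmetic makes the dependence easy to tabulate; the function name `prop1_exponents` is a hypothetical helper, not notation from the text. It also makes visible the identity $\tau_2 = \tau_1 + 1/2$, which follows directly from the two formulas.

```python
from fractions import Fraction


def prop1_exponents(nu0):
    """Return (tau_1, tau_2) from Proposition 1 for an entropy index nu0 > 0."""
    nu0 = Fraction(nu0)
    if nu0 <= 0:
        raise ValueError("nu0 must be positive")
    tau1 = 1 / (2 + 5 * nu0)
    tau2 = (4 + 5 * nu0) / (4 + 10 * nu0)
    return tau1, tau2


if __name__ == "__main__":
    for nu0 in (Fraction(1, 2), 1, 2, 4):
        tau1, tau2 = prop1_exponents(nu0)
        print(f"nu0={nu0}: tau1={tau1}, tau2={tau2}")
```

Smaller $\nu_0$ (a less massive class) gives a polynomial rate closer to $n^{-1/2}$, up to the logarithmic factor.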

Proposition 1 leads to the following strong approximation result. It generalizes
to classes of functions a result for classes of sets given in Theorem 7.4 of
Dudley and Philipp [14].

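Theorem 1. Under the assumptions and notation of Proposition 1, for all
$1/(2\tau_1) < \alpha < 1/\tau_1$ and $\gamma > 0$ there exist a $\rho(\alpha, \gamma) > 0$, a sequence of i.i.d.
$X_1, X_2, \dots$, and a sequence of independent copies $G_1, G_2, \dots$, of $G$ sitting on the
same probability space such that

(2.4) $\mathbb{P}\left\{ \max_{1 \le m \le n} \left\| \sqrt{m}\,\alpha_m - \sum_{i=1}^{m} G_i \right\|_{\mathcal{F}} > \rho(\alpha, \gamma)\, n^{1/2 - \tau(\alpha)} (\log n)^{\tau_2} \right\} \le n^{-\gamma}$

and

(2.5) $\max_{1 \le m \le n} \left\| \sqrt{m}\,\alpha_m - \sum_{i=1}^{m} G_i \right\|_{\mathcal{F}} = O\left( n^{1/2 - \tau(\alpha)} (\log n)^{\tau_2} \right)$ a.s.,

where $\tau(\alpha) = (\alpha \tau_1 - 1/2)/(1 + \alpha) > 0$.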
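The admissible range of $\alpha$ and the resulting exponent $\tau(\alpha)$ in Theorem 1 can be checked mechanically. The Python sketch below is a hypothetical helper (the name `theorem1_exponent` is not from the text) that validates $1/(2\tau_1) < \alpha < 1/\tau_1$ and returns $\tau(\alpha)$ as an exact fraction.

```python
from fractions import Fraction


def theorem1_exponent(nu0, alpha):
    """Return tau(alpha) = (alpha * tau_1 - 1/2) / (1 + alpha) from Theorem 1.

    Raises ValueError unless 1/(2 tau_1) < alpha < 1/tau_1, where
    tau_1 = 1 / (2 + 5 nu0).
    """
    nu0, alpha = Fraction(nu0), Fraction(alpha)
    if nu0 <= 0:
        raise ValueError("nu0 must be positive")
    tau1 = 1 / (2 + 5 * nu0)
    if not 1 / (2 * tau1) < alpha < 1 / tau1:
        raise ValueError("alpha outside the range allowed in Theorem 1")
    return (alpha * tau1 - Fraction(1, 2)) / (1 + alpha)


if __name__ == "__main__":
    # nu0 = 2 gives tau_1 = 1/12, so alpha must lie strictly between 6 and 12.
    for alpha in (7, 9, 11):
        print(alpha, theorem1_exponent(2, alpha))
```

Since the derivative of $\tau(\alpha)$ in $\alpha$ equals $(\tau_1 + 1/2)/(1 + \alpha)^2 > 0$, the exponent increases with $\alpha$ and stays below its limiting value $\tau_1/(2(1 + \tau_1))$ at the right endpoint.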



2.2. Bracketing numbers 

A second way to measure the size of the class $\mathcal{F}$ is to use $L_2(P)$-brackets instead of
$L_2(Q)$-balls. Let $l \in \mathcal{M}$ and $u \in \mathcal{M}$ be such that $l \le u$ and $d_P(l, u) < \varepsilon$. The pair
of functions $l, u$ form an $\varepsilon$-bracket $[l, u]$ consisting of all the functions $f \in \mathcal{F}$ such
that $l \le f \le u$. Let $N_{[\,]}(\varepsilon, \mathcal{F}, d_P)$ be the minimum number of $\varepsilon$-brackets needed to
cover $\mathcal{F}$. Notice that trivially we have $N(\varepsilon, \mathcal{F}, d_P) \le N_{[\,]}(\varepsilon/2, \mathcal{F}, d_P)$.

(BR) Assume that for some $b_0 > 0$ and $0 < r_0 < 1$,

(2.6) $\log N_{[\,]}(\varepsilon, \mathcal{F}, d_P) \le b_0^2 \varepsilon^{-2 r_0}$, $0 < \varepsilon < 1$.

We derive the following rate of Gaussian approximation assuming an exponentially
scattered index class $\mathcal{F}$, meaning that (2.6) holds. Note that we get a slower
rate in Proposition 2 than that given in Proposition 1.

Proposition 2. Under (F.i), (F.ii) and (BR), for each $\lambda > 1$ there exists a $\rho(\lambda) > 0$
such that for each integer $n \ge 1$ one can construct on the same probability space
random vectors $X_1, \dots, X_n$ i.i.d. $X$ and a version of $G$ such that

(2.7) $\mathbb{P}\left\{ \|\alpha_n - G\|_{\mathcal{F}} > \rho(\lambda) (\log n)^{-\kappa} \right\} \le n^{-\lambda}$,

where $\kappa = (1 - r_0)/(2 r_0)$.

Proposition 2 leads to the following generalization to classes of functions of a
result for classes of sets given in Theorem 7.1 of Dudley and Philipp [14].



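Theorem 2. Under the assumptions and notation of Proposition 2, with $\kappa < 1/2$
($1/2 < r_0 < 1$), for every $H > 0$ there exist $\rho(\tau, H) > 0$, a sequence of i.i.d.
$X_1, X_2, \dots$, and a sequence of independent copies $G_1, G_2, \dots$, of $G$ sitting on the
same probability space such that

(2.8) $\mathbb{P}\left\{ \max_{1 \le m \le n} \left\| \sqrt{m}\,\alpha_m - \sum_{i=1}^{m} G_i \right\|_{\mathcal{F}} > \sqrt{n}\, \rho(\tau, H) (\log n)^{-\tau} \right\} \le (\log n)^{-H}$

and

(2.9) $\max_{1 \le m \le n} \left\| \sqrt{m}\,\alpha_m - \sum_{i=1}^{m} G_i \right\|_{\mathcal{F}} = O\left( \sqrt{n} (\log n)^{-\tau} \right)$ a.s.,

where $\tau = \kappa (1/2 - \kappa)/(1 - \kappa)$.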
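For the bracketing case the two exponents $\kappa$ and $\tau$ are determined by $r_0$ alone. The Python sketch below evaluates them exactly under the restriction $1/2 < r_0 < 1$ of Theorem 2; the function name `theorem2_exponents` is a hypothetical helper introduced for illustration.

```python
from fractions import Fraction


def theorem2_exponents(r0):
    """Return (kappa, tau) with kappa = (1 - r0) / (2 r0) and
    tau = kappa (1/2 - kappa) / (1 - kappa), for 1/2 < r0 < 1."""
    r0 = Fraction(r0)
    if not Fraction(1, 2) < r0 < 1:
        raise ValueError("Theorem 2 assumes 1/2 < r0 < 1")
    kappa = (1 - r0) / (2 * r0)
    tau = kappa * (Fraction(1, 2) - kappa) / (1 - kappa)
    return kappa, tau


if __name__ == "__main__":
    for r0 in (Fraction(3, 5), Fraction(3, 4), Fraction(9, 10)):
        kappa, tau = theorem2_exponents(r0)
        print(f"r0={r0}: kappa={kappa}, tau={tau}")
```

In every admissible case $\tau < \kappa$, so the logarithmic rate in the strong approximation (2.8) is slower than the rate $(\log n)^{-\kappa}$ of the Gaussian approximation (2.7).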



3. Comments on the approach based on KMT 

Given $\mathcal{F}$, the rates obtained in Proposition 1 and Theorem 1 are universal in $P$.
If one specializes to particular $P$, the rates in Propositions 1 and 2 and Theorems
1 and 2 are far from being optimal. In such situations one can get better and
even unimprovable rates by replacing the use of Zaitsev [42] by the Komlos, Major
and Tusnady [KMT] [22] Brownian bridge approximation to the uniform empirical
process or one based on the same dyadic scheme. (More details about this
approximation are provided in 0, 13, 25, 27, [28].) This is especially the case when the
underlying probability measure $P$ is smooth. To see how this works in the empirical
process indexed by functions setup refer to Koltchinskii [21] and Rio [33], and in
the indexed by smooth sets situation turn to Revesz [32] and Massart [29]. One can
also use the KMT-type bivariate Brownian bridge approximation to the bivariate
uniform empirical process as a basis for further approximation. For a brief outline
of this approximation consult Tusnady [36] and for detailed presentations refer to
Castelle [5] and Castelle and Laurent-Bonvalot [6].



4. Tools needed in proofs 

For convenience we shall collect here the basic tools we shall need in our proofs.

4.1. Inequalities for empirical processes

On a rich enough probability space $(\Omega, \mathcal{T}, \mathbb{P})$, let $X, X_1, X_2, \dots, X_n$ be i.i.d. random
variables with law $P = \mathbb{P}^X$ and $\epsilon_1, \epsilon_2, \dots, \epsilon_n$ be i.i.d. Rademacher random variables
independent of $X_1, \dots, X_n$. By a Rademacher random variable $\epsilon_1$, we mean that
$\mathbb{P}(\epsilon_1 = 1) = \mathbb{P}(\epsilon_1 = -1) = 1/2$. Consider a point-wise measurable class $\mathcal{G}$ of
bounded measurable real valued functions on $(\mathcal{X}, \mathcal{A})$.

The following exponential inequality is due to Talagrand [HI . 

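Talagrand's inequality. If $\mathcal{G}$ satisfies (F.i) and (F.ii) then for all $n \ge 1$ and
$t > 0$ we have, for suitable finite constants $A > 0$ and $A_1 > 0$,

(4.1) $\mathbb{P}\left\{ \|\alpha_n\|_{\mathcal{G}} > A \left( \mathbb{E} \left\| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \epsilon_i g(X_i) \right\|_{\mathcal{G}} + t \right) \right\} \le 2 \exp\left( -\frac{A_1 t^2}{\sigma_{\mathcal{G}}^2} \right) + 2 \exp\left( -\frac{A_1 t \sqrt{n}}{M} \right)$,

where $\sigma_{\mathcal{G}}^2 := \sup_{g \in \mathcal{G}} \mathrm{Var}(g(X))$.

Moreover the constants $A$ and $A_1$ are independent of $\mathcal{G}$ and $M$. Next we state
two upper bounds for the above expectation of the supremum of the symmetrized
empirical process.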

We shall require two moment bounds. The first is due to Einmahl and Mason 
[l§ | - for a similar bound refer to Gine and Guillou [20| . 

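Moment inequality for (VC). Let $\mathcal{G}$ satisfy (F.i) and (F.ii) with envelope
function $G$ and be such that for some positive constants $\beta$, $\nu$, $c > 1$ and $\sigma \le 1/(8c)$ the
following four conditions hold,

$\mathbb{E}(G^2(X)) \le \beta^2$; $N_G(\varepsilon, \mathcal{G}) \le c\,\varepsilon^{-\nu}$, $0 < \varepsilon < 1$; $\sup_{g \in \mathcal{G}} \mathbb{E}(g^2(X)) \le \sigma^2$;

$\sup_{g \in \mathcal{G}} \|g\|_{\mathcal{X}} \le \frac{1}{2\sqrt{\nu + 1}} \sqrt{\frac{n \sigma^2}{\log(\beta \vee 1/\sigma)}}$.

Then we have for a universal constant $A_2$ not depending on $\beta$,

(4.2) $\mathbb{E} \left\| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \epsilon_i g(X_i) \right\|_{\mathcal{G}} \le A_2 \sqrt{\nu \sigma^2 \log(\beta \vee 1/\sigma)}$.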

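Next we state a moment inequality under (BR). For any $0 < \sigma < 1$, set

(4.3) $J(\sigma, \mathcal{G}) = \int_{[0, \sigma]} \sqrt{\log N_{[\,]}(s, \mathcal{G}, d_P)} \, ds$

and

(4.4) $a(\sigma, \mathcal{G}) = \frac{\sigma}{\sqrt{\log N_{[\,]}(\sigma, \mathcal{G}, d_P)}}$.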

The second moment bound follows from Lemma 19.34 in [37| and a standard sym- 
metrization inequality, and is reformulated by using 



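Moment inequality for (BR). Let $\mathcal{G}$ satisfy (F.i) and (F.ii) with envelope $G$
and be such that $\sup_{g \in \mathcal{G}} \mathbb{E}(g^2(X)) \le \sigma^2 < 1$. We have, for a universal constant
$A_3$,

(4.5) $\mathbb{E} \left\| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \epsilon_i g(X_i) \right\|_{\mathcal{G}} \le A_3 \left( J(\sigma, \mathcal{G}) + \sqrt{n}\, \mathbb{P}\left\{ G(X) > \sqrt{n}\, a(\sigma, \mathcal{G}) \right\} \right)$.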
4.2. Inequalities for Gaussian processes

Let $Z$ be a separable mean zero Gaussian process on a probability space $(\Omega, \mathcal{T}, \mathbb{P})$
indexed by a set $T$. Define the intrinsic semi-metric $\rho$ on $T$ by

(4.6) $\rho(s, t) = \sqrt{\mathbb{E}(Z_t - Z_s)^2}$.

For each $\varepsilon > 0$ let $N(\varepsilon, T, \rho)$ denote the minimal number of $\rho$-balls of radius $\varepsilon$
needed to cover $T$. Write $\|Z\|_T = \sup_{t \in T} |Z_t|$ and $\sigma_T^2(Z) = \sup_{t \in T} \mathbb{E}(Z_t^2)$. The
following large deviation probability estimate for $\|Z\|_T$ is due to Borell [2]. (Also
see Proposition A.2.1 in [38].)



Borell's inequality. For all $t > 0$,

(4.7) $\mathbb{P}\left\{ \left| \|Z\|_T - \mathbb{E}(\|Z\|_T) \right| > t \right\} \le 2 \exp\left( -\frac{t^2}{2 \sigma_T^2(Z)} \right)$.



According to Dudley [9], the entropy condition

(4.8) $\int_{[0, 1]} \sqrt{\log N(\varepsilon, T, \rho)} \, d\varepsilon < \infty$

ensures the existence of a separable, bounded, $d_P$-uniformly continuous
modification of $Z$. Moreover the above Dudley integral (4.8) controls the modulus of
continuity of $Z$ (see Dudley) as well as its expectation (see Marcus and Pisier [24],
p. 25, Ledoux and Talagrand [23], p. 300, de la Pena and Gine, Cor. 5.1.6, and
Dudley [13]). The following inequality is part of Corollary 2.2.8 in van der Vaart
and Wellner [38].

Gaussian moment inequality. For some universal constant $A_4 > 0$ and all $\sigma > 0$
we have

(4.9) $\mathbb{E}\left( \sup_{\rho(s, t) \le \sigma} |Z_t - Z_s| \right) \le A_4 \int_{[0, \sigma]} \sqrt{\log N(\varepsilon, T, \rho)} \, d\varepsilon$.

We shall be applying these inequalities to the Gaussian process $Z = G$ defined
in the introduction, so that $T = \mathcal{F}$ and $\rho = d_P$.



4.3. A maximal inequality

The following version of a maximal inequality due to Montgomery-Smith [30] (see
also Theorem 1.1.5 in Q) will come in handy.

A maximal inequality. Let $X_1, \dots, X_n$, $n \ge 1$, be i.i.d. random variables taking
values in a separable Banach space. Then for all $t > 0$,

(4.10) $\mathbb{P}\left\{ \max_{1 \le m \le n} \left\| \sum_{i=1}^{m} X_i \right\| > t \right\} \le 9\, \mathbb{P}\left\{ \left\| \sum_{i=1}^{n} X_i \right\| > \frac{t}{30} \right\}$.
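The maximal inequality (4.10) can be checked exactly in the simplest real-valued case, where the summands are i.i.d. Rademacher signs and the Banach space is the real line. The Python sketch below enumerates all $2^n$ sign sequences and returns both probabilities; the function name `tail_probs` and the choice of Rademacher summands are assumptions made for illustration.

```python
from itertools import product


def tail_probs(n, t):
    """Exact values of P{max_{m<=n} |S_m| > t} and P{|S_n| > t/30}
    for S_m = e_1 + ... + e_m with i.i.d. Rademacher signs e_i."""
    total = 2**n
    max_hits = 0
    end_hits = 0
    for signs in product((-1, 1), repeat=n):
        s = 0
        running_max = 0
        for e in signs:
            s += e
            running_max = max(running_max, abs(s))
        max_hits += running_max > t
        end_hits += abs(s) > t / 30
    return max_hits / total, end_hits / total


if __name__ == "__main__":
    for t in (0.5, 2, 4):
        lhs, rhs = tail_probs(10, t)
        print(f"t={t}: P(max > t)={lhs:.4f}, 9 * P(|S_n| > t/30)={9 * rhs:.4f}")
```

The constants 9 and 30 are far from tight in this toy case; the point of (4.10) is that they are universal, which is what the block arguments of Section 5.2 rely on.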
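5. Proofs of main results

5.1. Description of construction of $(\alpha_n, G)$

Under (F.i), (F.ii) and either (VC) or (BR), for any $\varepsilon > 0$ we can choose a grid

$\mathcal{H}(\varepsilon) = \{ h_k : 1 \le k \le N(\varepsilon) \}$

of measurable functions on $(\mathcal{X}, \mathcal{A})$ such that each $f \in \mathcal{F}$ is in a ball $\{ f \in \mathcal{M} :
d_P(h_k, f) < \varepsilon \}$ around some $h_k$, $1 \le k \le N(\varepsilon)$. The choice

(5.1) $N(\varepsilon) \le N(\varepsilon/2, \mathcal{F}, d_P)$

permits us to select $h_k \in \mathcal{F}$. Set

$\mathcal{F}(\varepsilon) = \{ (f, f') \in \mathcal{F}^2 : d_P(f, f') < \varepsilon \}$.

Fix $n \ge 1$. Let $X, X_1, \dots, X_n$ be independent with common law $P = \mathbb{P}^X$ and
$\epsilon_1, \dots, \epsilon_n$ be independent Rademacher random variables mutually independent of
$X_1, \dots, X_n$. Write for $\varepsilon > 0$,

$\mu_n(\varepsilon) = \mathbb{E}\left\{ \sup_{(f, f') \in \mathcal{F}(\varepsilon)} \left| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \epsilon_i (f - f')(X_i) \right| \right\}$

and

$\mu(\varepsilon) = \mathbb{E}\left\{ \sup_{(f, f') \in \mathcal{F}(\varepsilon)} |G(f) - G(f')| \right\}$.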

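Given $\varepsilon > 0$ and $n \ge 1$, our aim is to construct a probability space $(\Omega, \mathcal{T}, \mathbb{P})$ on
which sit $X_1, \dots, X_n$ and a version of the Gaussian process $G$ indexed by $\mathcal{F}$ such
that for $\mathcal{H}(\varepsilon)$ and $\mathcal{F}(\varepsilon)$ defined as above and for all $A > 0$, $\delta > 0$ and $t > 0$,

(5.2) $\mathbb{P}\left\{ \|\alpha_n - G\|_{\mathcal{F}} > A \mu_n(\varepsilon) + \mu(\varepsilon) + \delta + (A + 1) t \right\}$

$\le \mathbb{P}\left\{ \max_{h \in \mathcal{H}(\varepsilon)} |\alpha_n(h) - G(h)| > \delta \right\}$

$+ \mathbb{P}\left\{ \sup_{(f, f') \in \mathcal{F}(\varepsilon)} |\alpha_n(f) - \alpha_n(f')| > A \mu_n(\varepsilon) + A t \right\}$

$+ \mathbb{P}\left\{ \sup_{(f, f') \in \mathcal{F}(\varepsilon)} |G(f) - G(f')| > t + \mu(\varepsilon) \right\}$

$=: P_n(\delta) + Q_n(t, \varepsilon) + Q(t, \varepsilon)$,

with all these probabilities simultaneously small for suitably chosen $A > 0$, $\delta > 0$
and $t > 0$. Consider the $n$ i.i.d. mean zero random vectors in $\mathbb{R}^{N(\varepsilon)}$,

$Y_i := \left( h_1(X_i) - \mathbb{E}(h_1(X)), \dots, h_{N(\varepsilon)}(X_i) - \mathbb{E}(h_{N(\varepsilon)}(X)) \right)$, $1 \le i \le n$.

First note that by $h_k \in \mathcal{F}$ and (F.i), we have

$|Y_i|_{N(\varepsilon)} \le M \sqrt{N(\varepsilon)}$, $1 \le i \le n$.

Therefore by the coupling inequality (1.1) we can define $Y_1, \dots, Y_n$ i.i.d.

$Y := (Y^1, \dots, Y^{N(\varepsilon)})$

and $Z_1, \dots, Z_n$ i.i.d.

$Z := (Z^1, \dots, Z^{N(\varepsilon)})$

mean zero Gaussian vectors on the same probability space such that

(5.3) $P_n(\delta) \le \mathbb{P}\left\{ \frac{1}{\sqrt{n}} \left| \sum_{i=1}^{n} (Y_i - Z_i) \right|_{N(\varepsilon)} > \delta \right\} \le C_1 N(\varepsilon)^2 \exp\left( -\frac{C_2 \sqrt{n}\, \delta}{(N(\varepsilon))^{5/2} M} \right)$,

where $\mathrm{cov}(Z^l, Z^k) = \mathrm{cov}(Y^l, Y^k) = \langle h_l, h_k \rangle$. Moreover by Lemma A1 of Berkes
and Philipp [1] (also see Vorob'ev [40]) this space can be extended to include a
$P$-Brownian bridge $G$ indexed by $\mathcal{F}$ such that

$G(h_k) = n^{-1/2} \sum_{i=1}^{n} Z_i^k$.

The $P_n(\delta)$ in (5.2) is defined through this $G$. Notice that the probability space on
which $Y_1, \dots, Y_n$, $Z_1, \dots, Z_n$ and $G$ sit depends on $n \ge 1$ and the choice of $\varepsilon > 0$
and $\delta > 0$.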

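Observe that the class

$\mathcal{G}(\varepsilon) = \{ f - f' : (f, f') \in \mathcal{F}(\varepsilon) \}$

satisfies (F.i) with $M/2$ replaced by $M$, (F.ii) and

$\sigma_{\mathcal{G}(\varepsilon)}^2 = \sup_{(f, f') \in \mathcal{F}(\varepsilon)} \mathrm{Var}(f(X) - f'(X)) \le \sup_{(f, f') \in \mathcal{F}(\varepsilon)} d_P^2(f, f') \le \varepsilon^2$.

Thus with $A > 0$ as in (4.1) we get by applying Talagrand's inequality,

(5.4) $Q_n(t, \varepsilon) = \mathbb{P}\left\{ \|\alpha_n\|_{\mathcal{G}(\varepsilon)} > A(\mu_n(\varepsilon) + t) \right\} \le 2 \exp\left( -\frac{A_1 t^2}{\varepsilon^2} \right) + 2 \exp\left( -\frac{A_1 \sqrt{n}\, t}{M} \right)$.

Next, consider the separable centered Gaussian process $Z_{(f, f')} = G(f) - G(f')$
indexed by $T = \mathcal{F}(\varepsilon)$. We have

$\sigma_T^2(Z) = \sup_{(f, f') \in \mathcal{F}(\varepsilon)} \mathbb{E}\left( (G(f) - G(f'))^2 \right) = \sup_{(f, f') \in \mathcal{F}(\varepsilon)} \mathrm{Var}(f(X) - f'(X)) \le \sup_{(f, f') \in \mathcal{F}(\varepsilon)} d_P^2(f, f') \le \varepsilon^2$.

Borell's inequality (4.7) now gives

(5.5) $Q(t, \varepsilon) = \mathbb{P}\left\{ \sup_{(f, f') \in \mathcal{F}(\varepsilon)} |G(f) - G(f')| > t + \mu(\varepsilon) \right\} \le 2 \exp\left( -\frac{t^2}{2 \varepsilon^2} \right)$.

Putting (5.3), (5.4) and (5.5) together we obtain, for some positive constants $A$, $A_1$
and $A_5$ with $A_5 \le 1/2$,

(5.6) $\mathbb{P}\left\{ \|\alpha_n - G\|_{\mathcal{F}} > A \mu_n(\varepsilon) + \mu(\varepsilon) + \delta + (A + 1) t \right\}$

$\le C_1 N(\varepsilon)^2 \exp\left( -\frac{C_2 \sqrt{n}\, \delta}{(N(\varepsilon))^{5/2} M} \right) + 2 \exp\left( -\frac{A_1 \sqrt{n}\, t}{M} \right) + 4 \exp\left( -\frac{A_5 t^2}{\varepsilon^2} \right)$.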

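Proof of Proposition 1. Let us assume that (VC) holds with $F := M/2$, so that for
some $c_0 > 0$ and $\nu_0 > 0$, with $c_1 = c_0 \left( 2 \sqrt{P F^2} \right)^{\nu_0} = c_0 M^{\nu_0}$,

$N(\varepsilon) \le N(\varepsilon/2, \mathcal{F}, d_P) \le c_1 \varepsilon^{-\nu_0}$, $0 < \varepsilon < 1$.

Notice that both

$N(\varepsilon, \mathcal{G}(\varepsilon), d_P) \le (N(\varepsilon/2, \mathcal{F}, d_P))^2 \le c_1^2 \varepsilon^{-2\nu_0}$

and

$N(\varepsilon, \mathcal{F}(\varepsilon), d_P) \le (N(\varepsilon/2, \mathcal{F}, d_P))^2 \le c_1^2 \varepsilon^{-2\nu_0}$.

Therefore we can apply the moment bound assuming (VC) given in (4.2) taken
with $\mathcal{G} = \mathcal{G}(\varepsilon)$, $G := M$, $\nu = 2\nu_0$, $\sigma = \varepsilon$ and $\beta = M$, to get for any $0 < \varepsilon < 1/e$ and $n \ge 1$
so that

(5.7) $\frac{\sqrt{n}\, \varepsilon}{2 \sqrt{1 + 2\nu_0} \sqrt{\log(M \vee 1/\varepsilon)}} \ge M$

the bound

$\mu_n(\varepsilon) \le A_2\, \varepsilon \sqrt{2 \nu_0 \log(M \vee 1/\varepsilon)}$.

Whereas, by the Gaussian moment bound (4.9), we have for all $0 < \varepsilon < 1/e$,

$\mu(\varepsilon) \le A_4 \sqrt{2 \nu_0} \int_{[0, \varepsilon]} \sqrt{\log(1/x)} \, dx$.

Hence, for some $D > 0$ it holds for all $0 < \varepsilon < 1/e$ and $n \ge 1$ so that (5.7) holds,

(5.8) $A \mu_n(\varepsilon) + \mu(\varepsilon) \le D \varepsilon \sqrt{\log(1/\varepsilon)}$.

Therefore, in view of (5.8) and (5.6) it is natural to define for suitably large positive
$\gamma_1$ and $\gamma_2$,

$\delta = \gamma_1 \varepsilon \sqrt{\log(1/\varepsilon)}$ and $t = \gamma_2 \varepsilon \sqrt{\log(1/\varepsilon)}$.

We now have for all $0 < \varepsilon < 1/e$ and $n \ge 1$ so that (5.7) is satisfied, on a suitable
probability space depending on $n \ge 1$, $\varepsilon$ and $\delta$ so that (5.6) holds,

$\mathbb{P}\left\{ \|\alpha_n - G\|_{\mathcal{F}} > (D + \gamma_1 + (1 + A) \gamma_2)\, \varepsilon \sqrt{\log(1/\varepsilon)} \right\}$

$\le C_1 c_1^2 \varepsilon^{-2\nu_0} \exp\left( -\frac{C_2 \gamma_1 \sqrt{n}\, \varepsilon^{1 + 5\nu_0/2} \sqrt{\log(1/\varepsilon)}}{c_1^{5/2} M} \right) + 2 \exp\left( -\frac{A_1 \gamma_2 \sqrt{n}\, \varepsilon \sqrt{\log(1/\varepsilon)}}{M} \right) + 4 \exp\left( -A_5 \gamma_2^2 \log(1/\varepsilon) \right)$.

By taking $\varepsilon = ((\log n)/n)^{1/(2 + 5\nu_0)}$, which satisfies (5.7) for all large enough $n$, we
readily obtain from these last bounds that for every $\lambda > 1$ there exist $D > 0$, $\gamma_1 > 0$
and $\gamma_2 > 0$ such that for all $n \ge 1$, $\alpha_n$ and $G$ can be defined on the same probability
space so that

$\mathbb{P}\left\{ \|\alpha_n - G\|_{\mathcal{F}} > (D + \gamma_1 + (1 + A) \gamma_2) \left( \frac{\log n}{n} \right)^{1/(2 + 5\nu_0)} \sqrt{\log n} \right\} \le n^{-\lambda}$.

It is clear now that there exists a $\rho(\lambda) > 0$ such that (2.3) holds. This completes
the proof of Proposition 1. □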
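The choice of mesh in the proof of Proposition 1 can be sanity-checked numerically. The Python sketch below computes $\varepsilon_n = ((\log n)/n)^{1/(2 + 5\nu_0)}$ and compares $\varepsilon_n \sqrt{\log(1/\varepsilon_n)}$ with the stated rate $n^{-\tau_1} (\log n)^{\tau_2}$. The function names are hypothetical helpers, and the constants $D$, $\gamma_1$, $\gamma_2$ are deliberately left out because the text does not give their values.

```python
import math


def prop1_epsilon(n, nu0):
    """Mesh eps_n = ((log n) / n) ** (1 / (2 + 5 nu0)) used in the proof."""
    if n < 3 or nu0 <= 0:
        raise ValueError("need n >= 3 and nu0 > 0")
    return (math.log(n) / n) ** (1.0 / (2.0 + 5.0 * nu0))


def prop1_rate(n, nu0):
    """Rate n ** (-tau_1) * (log n) ** tau_2 appearing in (2.3)."""
    tau1 = 1.0 / (2.0 + 5.0 * nu0)
    tau2 = (4.0 + 5.0 * nu0) / (4.0 + 10.0 * nu0)
    return n ** (-tau1) * math.log(n) ** tau2


if __name__ == "__main__":
    for n in (10**3, 10**6, 10**9):
        eps = prop1_epsilon(n, nu0=1.0)
        print(n, eps * math.sqrt(math.log(1.0 / eps)), prop1_rate(n, nu0=1.0))
```

Since $\log(1/\varepsilon_n) = \tau_1 (\log n - \log\log n) \le \log n$ for $n \ge 3$, the first printed quantity never exceeds the second, which is how the exponent $\tau_2 = \tau_1 + 1/2$ of the logarithm arises.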

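Proof of Proposition 2. Under (BR) as defined in (2.6) we have, for some $0 < r_0 < 1$
and $b_0 > 0$,

$N(\varepsilon) \le N(\varepsilon/2, \mathcal{F}, d_P) \le N_{[\,]}(\varepsilon/2, \mathcal{F}, d_P) \le \exp\left( \frac{2^{2 r_0} b_0^2}{\varepsilon^{2 r_0}} \right)$, $0 < \varepsilon < 1$,

and as above both

$N(\varepsilon, \mathcal{G}(\varepsilon), d_P) \le N_{[\,]}(\varepsilon, \mathcal{G}(\varepsilon), d_P) \le (N_{[\,]}(\varepsilon/2, \mathcal{F}, d_P))^2 \le \exp\left( \frac{2^{1 + 2 r_0} b_0^2}{\varepsilon^{2 r_0}} \right)$

and

$N(\varepsilon, \mathcal{F}(\varepsilon), d_P) \le N_{[\,]}(\varepsilon, \mathcal{F}(\varepsilon), d_P) \le (N_{[\,]}(\varepsilon/2, \mathcal{F}, d_P))^2 \le \exp\left( \frac{2^{1 + 2 r_0} b_0^2}{\varepsilon^{2 r_0}} \right)$.

Setting $\sigma = \varepsilon$ in (4.3) and (4.4) we get

$J(\varepsilon, \mathcal{G}(\varepsilon)) \le 2^{r_0} \sqrt{2}\, b_0 \int_{[0, \varepsilon]} \frac{ds}{s^{r_0}} = \frac{2^{r_0} \sqrt{2}\, b_0}{1 - r_0} \varepsilon^{1 - r_0}$

and

$a(\varepsilon, \mathcal{G}(\varepsilon)) = \frac{\varepsilon}{\sqrt{\log N_{[\,]}(\varepsilon, \mathcal{G}(\varepsilon), d_P)}} \ge \frac{\varepsilon^{1 + r_0}}{2^{r_0} \sqrt{2}\, b_0}$.

Hence by the moment bound assuming (BR) given in (4.5) taken with $G(X) = M$,

$\mu_n(\varepsilon) \le A_3 \frac{2^{r_0} \sqrt{2}\, b_0}{1 - r_0} \varepsilon^{1 - r_0}$,

whenever $\sqrt{n}\, a(\varepsilon, \mathcal{G}(\varepsilon)) \ge M$, and, since in the same way we have

$\int_{[0, \varepsilon]} \sqrt{\log N(s, \mathcal{F}(\varepsilon), d_P)} \, ds \le \frac{2^{r_0} \sqrt{2}\, b_0}{1 - r_0} \varepsilon^{1 - r_0}$,

we get by the Gaussian moment inequality,

$\mu(\varepsilon) \le A_4 \frac{2^{r_0} \sqrt{2}\, b_0}{1 - r_0} \varepsilon^{1 - r_0}$.

As a consequence, for some $D > 0$ and

$\varepsilon \ge \frac{\left( 2^{r_0} \sqrt{2}\, b_0 M \right)^{1/(1 + r_0)}}{n^{1/(2 + 2 r_0)}}$

it follows that

$A \mu_n(\varepsilon) + \mu(\varepsilon) \le D \varepsilon^{1 - r_0}$.

Thus it is natural to take in (5.6) for some $\gamma_1 > 0$ and $\gamma_2 > 0$ large enough,

$\delta = \gamma_1 \varepsilon^{1 - r_0}$ and $t = \gamma_2 \varepsilon^{1 - r_0}$,

which gives with $\rho = D + \gamma_1 + (A + 1) \gamma_2$,

$\mathbb{P}\left\{ \|\alpha_n - G\|_{\mathcal{F}} > \rho\, \varepsilon^{1 - r_0} \right\}$

$\le C_1 \exp\left( \frac{2^{1 + 2 r_0} b_0^2}{\varepsilon^{2 r_0}} \right) \exp\left( -\frac{C_2 \gamma_1 \sqrt{n}\, \varepsilon^{1 - r_0}}{M} \exp\left( -\frac{5 \cdot 2^{2 r_0} b_0^2}{2 \varepsilon^{2 r_0}} \right) \right) + 2 \exp\left( -\frac{A_1 \gamma_2 \sqrt{n}\, \varepsilon^{1 - r_0}}{M} \right) + 4 \exp\left( -\frac{A_5 \gamma_2^2}{\varepsilon^{2 r_0}} \right)$.

We choose

$\varepsilon = \left( \frac{10 \cdot 2^{2 r_0} b_0^2}{\log n} \right)^{1/(2 r_0)}$,

which makes

$\exp\left( \frac{5 \cdot 2^{2 r_0} b_0^2}{2 \varepsilon^{2 r_0}} \right) = n^{1/4}$.

Given any $\lambda > 0$ we clearly see now from this last probability bound that for
$\rho(\lambda) > 0$ made large enough by increasing $\gamma_1$ and $\gamma_2$ we get for all $n \ge 1$,

$\mathbb{P}\left\{ \|\alpha_n - G\|_{\mathcal{F}} > \rho(\lambda) (\log n)^{-(1 - r_0)/(2 r_0)} \right\} \le n^{-\lambda}$.

The proof of Proposition 2 now follows the same lines as that of Proposition 1. □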



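5.2. Proofs of strong approximations

Notice that the conditions on $\mathcal{F}$ in Propositions 1 and 2 imply that there exists a
constant $B$ such that

$\sup_{n \ge 1} \mathbb{E}\left( \left\| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \epsilon_i f(X_i) \right\|_{\mathcal{F}} \right) \le B$ and $\mathbb{E}(\|G\|_{\mathcal{F}}) \le B$.

Therefore by Talagrand's inequality (4.1) and the Montgomery-Smith inequality
(4.10), for all $n \ge 1$ and $t > 0$ we have, for suitable finite constants $C > 0$ and
$C_1 > 0$,

(5.9) $\mathbb{P}\left\{ \max_{1 \le m \le n} \sqrt{m}\, \|\alpha_m\|_{\mathcal{F}} > C \sqrt{n} (B + t) \right\} \le 18 \exp\left( -\frac{C_1 t^2}{\sigma_{\mathcal{F}}^2} \right) + 18 \exp\left( -\frac{C_1 t \sqrt{n}}{M} \right)$,

where $\sigma_{\mathcal{F}}^2 := \sup_{f \in \mathcal{F}} \mathrm{Var}(f(X))$. Furthermore, by Borell's inequality (4.7), the
Montgomery-Smith inequality (4.10) and the fact that $n^{-1/2} \sum_{i=1}^{n} G_i =_d G$, for
i.i.d. $G_i$, we get for all $n \ge 1$ and $t > 0$ that for a suitable finite constant $D > 0$,

(5.10) $\mathbb{P}\left\{ \max_{1 \le m \le n} \left\| \sum_{i=1}^{m} G_i \right\|_{\mathcal{F}} > D \sqrt{n} (B + t) \right\} \le 18 \exp\left( -\frac{t^2}{2 \sigma_{\mathcal{F}}^2} \right)$.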
Proof of Theorem 1. Choose any $\gamma > 0$. We shall modify the scheme described on
pages 236-238 of Philipp [31] to construct a probability space on which (2.4) and
(2.5) hold. Let $n_0 = 1$ and for each $k \ge 1$ set $n_k = [k^\alpha]$, where $[x]$ denotes the
integer part of $x$ and $\alpha$ is chosen so that

(5.11) $1/2 < \tau_1 \alpha < 1$.

Notice that $\tau_1 < 1/2$ in Proposition 1 and thus $\alpha > 1$.

Applying Proposition 1, we see that for each $\lambda > 1$ there exists a $\rho = \rho(\lambda) > 0$
such that one can construct a sequence of independent pairs $(\alpha_{n_k}^{(k)}, G^{(k)})_{k \ge 1}$ sitting
on the same probability space satisfying for all $k \ge 1$,

(5.12) $\mathbb{P}\left\{ \left\| \alpha_{n_k}^{(k)} - G^{(k)} \right\|_{\mathcal{F}} > \rho\, n_k^{-\tau_1} (\log n_k)^{\tau_2} \right\} \le n_k^{-\lambda}$.

Set for $k \ge 1$

$t_k = \sum_{j < k} n_j \sim \frac{1}{1 + \alpha} k^{\alpha + 1}$.

Using Lemma A1 of Berkes and Philipp [1] we can assume that each $\alpha_{n_k}^{(k)}$ is formed
from $X_{t_k + 1}, \dots, X_{t_{k+1}}$ i.i.d. $X$ and that each $G^{(k)}$ is formed as

$G^{(k)} = \frac{1}{\sqrt{n_k}} \sum_{t_k < j \le t_{k+1}} G_j$,

where $G_{t_k + 1}, \dots, G_{t_{k+1}}$ are i.i.d. $G$. Moreover we can do this in such a way that
$X_1, X_2, \dots$, are i.i.d. $X$ and $G_1, G_2, \dots$, are i.i.d. $G$. For any integer $N \ge 2$ set
$N(\beta) = [N^\beta]$, where $\beta = \alpha/(1 + \alpha)$. Define

$s(N) = \sum_{k = N(\beta)}^{N} n_k^{1/2 - \tau_1} (\log n_k)^{\tau_2}$.

Now for some constants $c_1 > 0$ and $c_0 > 0$,

(5.13) $s(N) \sim c_1 N^{(1 + \alpha)/2 - (\alpha \tau_1 - 1/2)} (\log N)^{\tau_2} \sim c_0\, t_N^{1/2 - \tau(\alpha)} (\log t_N)^{\tau_2}$,

where $\tau(\alpha) = (\alpha \tau_1 - 1/2)/(1 + \alpha) > 0$, by (5.11).
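The block structure used in this proof is easy to tabulate. The Python sketch below computes the block endpoints $t_k = \sum_{j<k} n_j$ with $n_0 = 1$ and $n_j = [j^\alpha]$, and compares $t_k$ with its stated asymptotic value $k^{\alpha+1}/(1+\alpha)$. The function name `block_endpoints` and the value of $\alpha$ in the usage example are assumptions made for illustration.

```python
import math


def block_endpoints(alpha, k_max):
    """Return [t_1, ..., t_{k_max}] where t_k = sum_{j<k} n_j,
    n_0 = 1 and n_j = floor(j ** alpha) for j >= 1."""
    if alpha <= 1 or k_max < 1:
        raise ValueError("need alpha > 1 and k_max >= 1")
    endpoints = []
    total = 1  # n_0 = 1, hence t_1 = 1
    for k in range(1, k_max + 1):
        endpoints.append(total)
        total += math.floor(k**alpha)  # add n_k to obtain t_{k+1}
    return endpoints


if __name__ == "__main__":
    alpha = 2.5
    t = block_endpoints(alpha, 2000)
    for k in (10, 100, 2000):
        print(k, t[k - 1] / (k ** (alpha + 1) / (alpha + 1)))
```

The printed ratios approach 1, and the same table shows $t_{k+1}/t_k \to 1$, the property used at the end of the proof to pass from the subsequence $t_N$ to all $n$.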
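We have

$\mathbb{P}\left\{ \max_{1 \le m \le t_N} \left\| \sum_{j=1}^{m} [f(X_j) - \mathbb{E} f(X) - G_j(f)] \right\|_{\mathcal{F}} > \rho\, s(N) \right\}$

$\le \mathbb{P}\left\{ \max_{1 \le m \le t_{N(\beta)}} \left\| \sum_{j=1}^{m} [f(X_j) - \mathbb{E} f(X)] \right\|_{\mathcal{F}} > \frac{\rho\, s(N)}{4} \right\}$

$+ \mathbb{P}\left\{ \max_{1 \le m \le t_{N(\beta)}} \left\| \sum_{j=1}^{m} G_j(f) \right\|_{\mathcal{F}} > \frac{\rho\, s(N)}{4} \right\}$

$+ \sum_{k = N(\beta)}^{N-1} \mathbb{P}\left\{ \max_{t_k + 1 \le m \le t_{k+1}} \left\| \sum_{j = t_k + 1}^{m} [f(X_j) - \mathbb{E} f(X)] \right\|_{\mathcal{F}} > \frac{\rho\, s(N)}{8} \right\}$

$+ \sum_{k = N(\beta)}^{N-1} \mathbb{P}\left\{ \max_{t_k + 1 \le m \le t_{k+1}} \left\| \sum_{j = t_k + 1}^{m} G_j(f) \right\|_{\mathcal{F}} > \frac{\rho\, s(N)}{8} \right\}$

$+ \mathbb{P}\left\{ \max_{N(\beta) \le j \le N} \left\| \sum_{k = N(\beta)}^{j} \sqrt{n_k} \left( \alpha_{n_k}^{(k)} - G^{(k)} \right) \right\|_{\mathcal{F}} > \frac{\rho\, s(N)}{4} \right\}$

$=: \sum_{i=1}^{5} P_i(\rho, N)$.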

It is easy to show using inequalities (5.9) and (5.10), along with the choice of
$1/2 < \beta = \alpha/(1 + \alpha) < 1$, that for any $\gamma > 0$ for all large enough $\rho$,

(5.14) $\sum_{i=1}^{2} P_i(\rho, N) \le t_N^{-\gamma}/4$, for all $N \ge 1$.

For instance, consider $P_1(\rho, N)$. Observe that

$P_1(\rho, N) \le \mathbb{P}\left\{ \max_{1 \le m \le t_{N(\beta)}} \sqrt{m}\, \|\alpha_m\|_{\mathcal{F}} > C \sqrt{t_{N(\beta)}} (B + \tau_N) \right\}$,

where

$\tau_N = \frac{\rho\, s(N)}{4 C \sqrt{t_{N(\beta)}}} - B$.

Now $\sqrt{t_{N(\beta)}} \sim c_2 N^{\alpha/2}$ for some $c_2 > 0$. Therefore by (5.13) for some $c_3 > 0$,

$\tau_N \sim c_3 N^{1 - \tau_1 \alpha} (\log N)^{\tau_2}$.

Since by (5.11) we have $1 - \tau_1 \alpha > 0$, we readily get from inequality (5.9) that
for any $\gamma > 0$ and all large enough $\rho$, $P_1(\rho, N) \le t_N^{-\gamma}/8$, for all $N \ge 1$. In the
same way we get using inequality (5.10) that for any $\gamma > 0$ and all large enough $\rho$,
$P_2(\rho, N) \le t_N^{-\gamma}/8$ for all $N \ge 1$. Hence we have (5.14).

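In a similar fashion one can verify that for any $\gamma > 0$ and all large enough $\rho$,

(5.15) $\sum_{i=3}^{4} P_i(\rho, N) \le t_N^{-\gamma}/4$, for all $N \ge 1$.

To see this, notice that

$P_3(\rho, N) \le N\, \mathbb{P}\left\{ \max_{1 \le m \le n_N} \sqrt{m}\, \|\alpha_m\|_{\mathcal{F}} > \rho\, s(N)/8 \right\}$

and

$P_4(\rho, N) \le N\, \mathbb{P}\left\{ \max_{1 \le m \le n_N} \left\| \sum_{j=1}^{m} G_j(f) \right\|_{\mathcal{F}} > \rho\, s(N)/8 \right\}$.

Since $\sqrt{n_N} \sim N^{\alpha/2}$ and $N \sim c_3 t_N^{1/(\alpha + 1)}$ for some $c_3 > 0$, we get (5.15) by proceeding
as above using inequalities (5.9) and (5.10).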
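Next, recalling the definition of $s(N)$, we get

$P_5(\rho, N) \le \mathbb{P}\left\{ \sum_{k = N(\beta)}^{N} \sqrt{n_k} \left\| \alpha_{n_k}^{(k)} - G^{(k)} \right\|_{\mathcal{F}} > \frac{\rho\, s(N)}{4} \right\} \le \sum_{k = N(\beta)}^{N} \mathbb{P}\left\{ \sqrt{n_k} \left\| \alpha_{n_k}^{(k)} - G^{(k)} \right\|_{\mathcal{F}} > \frac{\rho\, n_k^{1/2 - \tau_1} (\log n_k)^{\tau_2}}{4} \right\}$,

which by (5.12) for any $\lambda > 0$ and $\rho = \rho(\alpha, \lambda) > 0$ large enough is

$\le \sum_{k = N(\beta)}^{N} n_k^{-\lambda}$, for all $N \ge 1$,

which, in turn, for large enough $\lambda > 0$ is $\le t_N^{-\gamma}/2$. Thus for all $\gamma > 0$ there exists a
$\rho > 0$ so that

$\sum_{i=1}^{5} P_i(\rho, N) \le t_N^{-\gamma}$, for all $N \ge 1$.

Since $\alpha$ can be any number satisfying $1/2 < \tau_1 \alpha < 1$ and $t_{N+1}/t_N \to 1$, this
implies (2.4) for $\rho = \rho(\alpha, \lambda)$ large enough. The almost sure statement (2.5) follows
trivially from (2.4) using a simple blocking and the Borel-Cantelli lemma on the
just constructed probability space. This proves Theorem 1. □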

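Proof of Theorem 2. The proof follows along the same lines as that of Theorem 1.
Therefore for the sake of brevity we shall only outline the proof. Here we borrow
ideas from the proof of Theorem 6.2 of Dudley and Philipp [14]. Recall that in
Theorem 2 we assume that $1/2 < r_0 < 1$ in Proposition 2, which means that
$0 < \kappa := (1 - r_0)/(2 r_0) < 1/2$. For $k \ge 1$ set

(5.16) $t_k = [\exp(k^{1 - \kappa})]$ and $n_k = t_k - t_{k-1}$, where $t_0 = 1$.

Now for some $b > 0$ we get $n_k \sim b^2 k^{-\kappa} t_k$,

$\frac{\sqrt{n_k}}{(\log n_k)^{\kappa}} \sim \frac{b \sqrt{t_k}}{k^{\kappa(1 - \kappa) + \kappa/2}} = \frac{b \sqrt{t_k}}{k^{\kappa + \theta}}$,

where $\theta = \kappa(1/2 - \kappa) > 0$. Choose $0 < \beta < 1$ and set $N(\beta) = [N^\beta]$. Using an
integral approximation we get for suitable constants $c_1 > 0$ and $c_2 > 0$, for all large
$N$, with $\tau = \theta/(1 - \kappa)$,

(5.17) $\frac{c_1 \sqrt{t_N}}{(\log t_N)^{\tau}} \le s(N) := \sum_{k = N(\beta)}^{N} \frac{\sqrt{n_k}}{(\log n_k)^{\kappa}} \le \frac{c_2 \sqrt{t_N}}{(\log t_N)^{\tau}}$.

Also for all large $N$, for some constant $c > 0$,

(5.18) $s(N)/\sqrt{n_N} \ge c N^{\kappa/2 - \kappa(1/2 - \kappa)} = c N^{\kappa^2}$.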
For later use note that for any < (3 < 1 and £ > 

(5.19) 00, as AT ^00, 
and observe that

(5.20) $t_{N+1}/t_N \to 1$, as $N \to \infty$.
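The limit in (5.20) holds because $(k+1)^{1-\kappa} - k^{1-\kappa} \to 0$, but the convergence is slow. The Python sketch below evaluates $t_k = [\exp(k^{1-\kappa})]$ from (5.16) and the ratio $t_{k+1}/t_k$ for a few values of $k$. The function names and the value $\kappa = 0.4$ in the usage example are assumptions for illustration, and $k$ must stay small enough that $\exp(k^{1-\kappa})$ fits in a floating point number.

```python
import math


def theorem2_block_endpoint(k, kappa):
    """t_k = floor(exp(k ** (1 - kappa))) for k >= 1, as in (5.16)."""
    if k < 1 or not 0 < kappa < 0.5:
        raise ValueError("need k >= 1 and 0 < kappa < 1/2")
    return math.floor(math.exp(k ** (1.0 - kappa)))


def endpoint_ratio(k, kappa):
    """Ratio t_{k+1} / t_k, which tends to 1 as k grows."""
    return theorem2_block_endpoint(k + 1, kappa) / theorem2_block_endpoint(k, kappa)


if __name__ == "__main__":
    for k in (100, 1000, 10000):
        print(k, endpoint_ratio(k, kappa=0.4))
```

To first order the ratio behaves like $\exp((1 - \kappa) k^{-\kappa})$, so blocks grow almost geometrically for small $k$ and only eventually become negligible relative to $t_k$.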

Constructing a probability space and defining $P_i(\rho, N)$, $i = 1, \dots, 5$, as in the proof
of Theorem 1, but with $n_k$, $t_k$ and $s(N)$ as given in (5.16) and (5.17), the proof
now goes much like that of Theorem 1. In particular, using inequalities (5.9) and
(5.10), and noting that $N \sim (\log(t_N))^{1/(1 - \kappa)}$, one can check that for some $\nu > 0$,
for all large enough $N$,

$\sum_{i=1}^{4} P_i(\rho, N) \le \exp\left( -(\log(t_N))^{\nu} \right)$

and by arguing as in the proof of Theorem 1, but now using Proposition 2, we easily
see that for every $H > 0$ there is a probability space on which sit i.i.d. $X_1, X_2, \dots$,
and i.i.d. $G_1, G_2, \dots$, and a $\rho > 0$ such that

$P_5(\rho, N) \le (\log(t_N))^{-H - 1}$, for all $N \ge 1$.

Since for all $H > 0$,

$(\log(t_N))^{H} \left( \exp\left( -(\log(t_N))^{\nu} \right) + (\log(t_N))^{-H - 1} \right) \to 0$, as $N \to \infty$,

this in combination with (5.17) and (5.20) proves that (2.8) holds with $\tau = \theta/(1 - \kappa)$
and $\rho(\tau, H)$ large enough. A simple blocking argument shows that (2.9) follows from
(2.8). Choose $H > 1$ in (2.8). Notice that for any $k \ge 1$,

$\mathbb{P}\left\{ \bigcup_{2^k < n \le 2^{k+1}} \left\{ \max_{1 \le m \le n} \left\| \sqrt{m}\,\alpha_m - \sum_{i=1}^{m} G_i \right\|_{\mathcal{F}} > \sqrt{2n}\, \rho(\tau, H) (\log n)^{-\tau} \right\} \right\}$

$\le \mathbb{P}\left\{ \max_{1 \le m \le 2^{k+1}} \left\| \sqrt{m}\,\alpha_m - \sum_{i=1}^{m} G_i \right\|_{\mathcal{F}} > \sqrt{2^{k+1}}\, \rho(\tau, H) (\log 2^{k+1})^{-\tau} \right\}$

$\le ((k + 1) \log 2)^{-H}$.

Hence (2.9) holds by the Borel-Cantelli lemma. □